A New Distribution on the Simplex with Auto-Encoding Applications

Stirn, Andrew, Jebara, Tony, Knowles, David

Neural Information Processing Systems

We construct a new distribution for the simplex using the Kumaraswamy distribution and an ordered stick-breaking process. We explore and develop the theoretical properties of this new distribution and prove that it exhibits symmetry (exchangeability) under the same conditions as the well-known Dirichlet. Like the Dirichlet, the new distribution is adept at capturing sparsity but, unlike the Dirichlet, has an exact, closed-form reparameterization, making it well suited for deep variational Bayesian modeling. We demonstrate the distribution's utility in a variety of semi-supervised auto-encoding tasks. In all cases, the resulting models achieve competitive performance commensurate with their simplicity, use of explicit probability models, and abstinence from adversarial training.
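The closed-form reparameterization rests on the Kumaraswamy inverse CDF, (1 - (1 - u)^(1/b))^(1/a), which turns uniform noise into stick-breaking fractions. Below is a minimal sketch of drawing a simplex point this way; it is not the paper's exact ordered construction, and the parameter values a, b are illustrative:

```python
import numpy as np

def kumaraswamy_sample(a, b, rng):
    # Reparameterized draw via the closed-form inverse CDF:
    # F^{-1}(u) = (1 - (1 - u)^(1/b))^(1/a), with u ~ Uniform(0, 1).
    u = rng.uniform(size=np.shape(a))
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)

def stick_breaking_simplex(a, b, rng):
    # Break a unit stick with K-1 Kumaraswamy fractions; the K
    # resulting weights are nonnegative and sum to one by telescoping.
    v = kumaraswamy_sample(np.asarray(a, float), np.asarray(b, float), rng)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)))
    return remaining * np.concatenate((v, [1.0]))

rng = np.random.default_rng(0)
w = stick_breaking_simplex([2.0, 2.0, 2.0], [3.0, 3.0, 3.0], rng)
# w is a length-4 vector on the simplex
```

Because every step is a differentiable function of uniform noise, gradients can flow through the sample, which is what makes the construction suitable for variational training.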


Reviews: A New Distribution on the Simplex with Auto-Encoding Applications

Neural Information Processing Systems

Originality: Although VAEs using a stick-breaking construction with Kumaraswamy distributions have been considered before (Nalisnick & Smyth, Stick-Breaking Variational Autoencoders, 2017), the idea to use such a construction and extend it by mixing over the orderings to obtain a density more similar to a Dirichlet is new and interesting. Related work is adequately cited. Quality: The paper seems technically sound and its claims are largely supported. Although Theorem 1 is a standard result, restating it is likely useful for the subsequent exposition. Experimental results show that the method outperforms some baselines; however, I feel that some additional experiments would be useful (see details below in Section 5, Improvements).


Continual Learning with Strategic Selection and Forgetting for Network Intrusion Detection

Zhang, Xinchen, Zhao, Running, Jiang, Zhihan, Chen, Handi, Ding, Yulong, Ngai, Edith C. H., Yang, Shuang-Hua

arXiv.org Artificial Intelligence

Intrusion Detection Systems (IDS) are crucial for safeguarding digital infrastructure. In dynamic network environments, both threat landscapes and normal operational behaviors are constantly changing, resulting in concept drift. While continual learning mitigates the adverse effects of concept drift, insufficient attention to drift patterns and excessive preservation of outdated knowledge can still hinder the IDS's adaptability. In this paper, we propose SSF (Strategic Selection and Forgetting), a novel continual learning method for IDS that provides continuous model updates with a constantly refreshed memory buffer. Our approach features a strategic sample selection algorithm to select representative new samples and a strategic forgetting mechanism to drop outdated ones. The proposed sample selection algorithm prioritizes new samples that cause the 'drifted' pattern, enabling the model to better understand the evolving landscape. Additionally, we introduce strategic forgetting upon detecting significant drift: outdated samples are discarded to free up memory, allowing the incorporation of more recent data. SSF captures evolving patterns effectively and keeps the model aligned with changing data patterns, significantly enhancing the IDS's adaptability to concept drift. The state-of-the-art performance of SSF on the NSL-KDD and UNSW-NB15 datasets demonstrates its superior adaptability to concept drift for network intrusion detection.
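The buffer mechanics described above can be sketched generically. This is not SSF's actual algorithm; the drift score, threshold, and capacity below are illustrative assumptions:

```python
from collections import deque

class DriftAwareBuffer:
    # Fixed-capacity memory: admit samples that score high under some
    # drift measure; on detected drift, drop the oldest half to make
    # room for data from the new distribution.
    def __init__(self, capacity=8):
        self.buf = deque(maxlen=capacity)

    def add(self, sample, drift_score, threshold=0.5):
        # Prioritize samples that reflect the drifted pattern.
        if drift_score >= threshold:
            self.buf.append(sample)

    def forget_on_drift(self):
        # Strategic forgetting: discard the oldest half of the buffer.
        for _ in range(len(self.buf) // 2):
            self.buf.popleft()

buf = DriftAwareBuffer(capacity=8)
for i in range(4):
    buf.add(i, drift_score=0.9)   # representative samples are kept
buf.add(99, drift_score=0.1)      # low-drift sample is skipped
buf.forget_on_drift()             # oldest half evicted
```

The point of the sketch is the two levers: admission is score-based while eviction is age-based, so the buffer tracks the current distribution without unbounded growth.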


MetaRM: Shifted Distributions Alignment via Meta-Learning

Dou, Shihan, Liu, Yan, Zhou, Enyu, Li, Tianlong, Jia, Haoxiang, Xiong, Limao, Zhao, Xin, Ye, Junjie, Zheng, Rui, Gui, Tao, Zhang, Qi, Huang, Xuanjing

arXiv.org Artificial Intelligence

The success of Reinforcement Learning from Human Feedback (RLHF) in language model alignment is critically dependent on the capability of the reward model (RM). However, as training progresses, the output distribution of the policy model shifts, reducing the RM's ability to distinguish between responses. This issue is further compounded when the RM, trained on a specific data distribution, struggles to generalize to examples outside that distribution. These two issues can be unified as a single challenge: the shifted distribution of the environment. To surmount this challenge, we introduce MetaRM, a method leveraging meta-learning to align the RM with the shifted environment distribution. MetaRM trains the RM by minimizing the loss, particularly on data that improves its ability to differentiate examples from the shifted target distribution. Extensive experiments demonstrate that MetaRM significantly improves the RM's distinguishing ability in iterative RLHF optimization and also provides the capacity to identify subtle differences in out-of-distribution samples.


LARA: A Light and Anti-overfitting Retraining Approach for Unsupervised Anomaly Detection

Chen, Feiyi, Qin, Zhen, Zhang, Yingying, Deng, Shuiguang, Xiao, Yi, Pang, Guansong, Wen, Qingsong

arXiv.org Artificial Intelligence

Most current anomaly detection models assume that the normal pattern stays the same over time. However, the normal patterns of Web services change dramatically and frequently, and a model trained on old-distribution data becomes outdated after such changes. Retraining the whole model every time is expensive. Moreover, at the beginning of a normal-pattern change, there is not enough observation data from the new distribution, and retraining a large neural network model with limited data is vulnerable to overfitting. Thus, we propose a Light and Anti-overfitting Retraining Approach (LARA) for deep variational auto-encoder based time series anomaly detection methods (VAEs). This work makes three novel contributions: 1) formulating the retraining process as a convex problem, which converges at a fast rate and prevents overfitting; 2) designing a ruminate block, which leverages historical data without the need to store it; and 3) proving mathematically that, when fine-tuning the latent vector and reconstructed data, linear formations achieve the least adjusting error between the ground truths and the fine-tuned values. Moreover, we have performed many experiments verifying that retraining LARA with as few as 43 time slots of data from the new distribution yields an F1 score competitive with state-of-the-art anomaly detection models trained on sufficient data. We also verify its light overhead.


Stat Stories: Normalizing Flows as an Application of Variable Transformation

#artificialintelligence

While other statistical methods such as Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs) have achieved dramatic results on difficult tasks such as learning distributions of images and other complicated datasets, they do not allow density estimation or calculation of the probability density of new data points. In this sense, normalizing flows prove valuable: the method can perform density estimation and sampling as well as variational inference. Consider a transformation u = g(x; θ), i.e., g is parametrized by a parameter vector θ.
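The change-of-variables rule behind this: if u = g(x; θ) is invertible and differentiable, then p_x(x) = p_u(g(x)) |det ∂g/∂x|. A minimal sketch with a one-dimensional affine flow and a standard-normal base density (the scale and shift values are illustrative):

```python
import math

def affine_flow_logpdf(x, scale, shift):
    # Affine transformation u = g(x) = scale * x + shift.
    # Change of variables: p_x(x) = p_u(g(x)) * |du/dx|, so
    # log p_x(x) = log p_u(u) + log|scale|.
    u = scale * x + shift
    base_logpdf = -0.5 * (u * u + math.log(2 * math.pi))  # standard normal
    return base_logpdf + math.log(abs(scale))

lp = affine_flow_logpdf(0.5, scale=2.0, shift=-1.0)
```

Here u = 2·0.5 - 1 = 0, so the result equals the standard-normal log-density at zero plus log 2; equivalently, x is distributed N(0.5, 0.25), and the flow recovers exactly that density. Stacking several such invertible maps, each with a tractable Jacobian, gives a full normalizing flow.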


The $r$-value: evaluating stability with respect to distributional shifts

Gupta, Suyash, Rothenhäusler, Dominik

arXiv.org Machine Learning

Common statistical measures of uncertainty like $p$-values and confidence intervals quantify the uncertainty due to sampling, that is, the uncertainty due to not observing the full population. In practice, populations change between locations and across time. This makes it difficult to gather knowledge that transfers across data sets. We propose a measure of uncertainty that quantifies the distributional uncertainty of a statistical estimand with respect to Kullback-Leibler divergence, that is, the sensitivity of the parameter under general distributional perturbations within a Kullback-Leibler divergence ball. If the signal-to-noise ratio is small, distributional uncertainty is a monotone transformation of the signal-to-noise ratio. In general, however, it is a different concept and corresponds to a different research question. Further, we propose measures to estimate the stability of parameters with respect to directional or variable-specific shifts. We also demonstrate how the measure of distributional uncertainty can be used to prioritize data collection for better estimation of statistical parameters under shifted distributions. We evaluate the performance of the proposed measure in simulations and on real data and show that it can elucidate the distributional (in-)stability of an estimator with respect to certain shifts and give more accurate estimates of parameters under a shifted distribution while requiring only limited information to be collected from the shifted distribution.
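The flavor of sensitivity within a KL ball can be illustrated with the standard first-order expansion from distributionally robust statistics: for a mean estimand, the worst case within a KL ball of radius δ moves the mean by roughly sqrt(2δ · Var). This is a generic illustration, not the paper's r-value itself, and the sample data below are made up:

```python
import math
import statistics

def worst_case_mean_shift(samples, delta):
    # First-order bound for perturbations Q with KL(Q || P) <= delta:
    # sup_Q E_Q[X] ~= E_P[X] + sqrt(2 * delta * Var_P(X)).
    mean = statistics.mean(samples)
    var = statistics.pvariance(samples)
    return mean + math.sqrt(2 * delta * var)

shifted = worst_case_mean_shift([0.0, 1.0, 0.0, 1.0], delta=0.02)
```

High-variance estimands are thus more fragile under the same divergence budget, which is the intuition behind reporting a stability measure alongside the point estimate.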


Weights & Biases - ML Best Practices: Test Driven Development at Latent Space

#artificialintelligence

I sat down with the Latent Space team to talk about best practices around collaboration and managing model iteration. In machine learning, bugs may affect the distribution of possible models more than any particular instance, making traditional deterministic tests misleading. Because of this, a test-driven development framework for large ML models must account for the statistical nature of training. This is especially crucial when multiple researchers and engineers are contributing to the same model, as it's easy to silently introduce regressions into a codebase. Here, the team shares some insights about how this new form of test-driven development has been the key to moving quickly on a large-scale collaborative project.
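One way to make the "statistical nature of training" concrete: run the training under several random seeds and assert on the distribution of the metric rather than a single run. This is a generic sketch, not Latent Space's actual harness; the seed count, metrics, and tolerance are illustrative:

```python
import statistics

def regression_check(metric_by_seed, baseline_mean, max_drop=0.02):
    # A deterministic assert on one run is noisy; instead compare the
    # mean metric across seeds against the committed baseline, with a
    # tolerance sized to normal run-to-run variation.
    mean = statistics.mean(metric_by_seed)
    stdev = statistics.stdev(metric_by_seed)
    return mean >= baseline_mean - max_drop, mean, stdev

# e.g. validation accuracy from four seeded training runs
ok, mean, stdev = regression_check([0.913, 0.908, 0.917, 0.911],
                                   baseline_mean=0.91)
```

A change that silently degrades the model shifts the whole distribution of outcomes, which this kind of check catches even when any single run could plausibly pass.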